Skip to content

TYP: use __all__ to signal public API to type checkers #43695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 28, 2021
Merged

TYP: use __all__ to signal public API to type checkers #43695

merged 10 commits into from
Sep 28, 2021

Conversation

twoertwein
Copy link
Member

@twoertwein twoertwein commented Sep 22, 2021

@jreback jreback added the Typing type annotations, mypy/pyright type checking label Sep 22, 2021
@jreback jreback added this to the 1.4 milestone Sep 22, 2021
@jreback
Copy link
Contributor

jreback commented Sep 22, 2021

cc @simonjayhawkins

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ex @jreback comment. This is a precursor to #43672?

@twoertwein
Copy link
Member Author

This is a precursor to #43672?

No, but once we add a py.typed file we could use pyright to make sure that a list of intended to be public functions/attributes are picked up as public by type checkers.

import pandas.testing
import pandas.arrays
from pandas import testing
from pandas import arrays
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why api, testing, and arrays was not imported with "from pandas import"? I changed it to from pandas import to include them in __all__.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be equivalent (ex the name space visiblity).

note that we may not be fully testing this (see in pandas/tests/api/test_api.py)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test for __all__ and found in that process a few un-exported modules.

@twoertwein twoertwein changed the title TYP: redundant imports for public API TYP: use __all__ to signal public API Sep 22, 2021
@twoertwein twoertwein changed the title TYP: use __all__ to signal public API TYP: use __all__ to signal public API to type checkers Sep 22, 2021
assert not extraneous

missing = expected - actual
assert not missing
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert actual == expected would work, but it doesn't create a nice log in pytest.

@twoertwein
Copy link
Member Author

I think the modules I added in the last commit are not meant to be public (compat, core, io, ...)? @simonjayhawkins

If the above modules are not meant to be public, there probably needs to be a discussion which additional classes should be exported in pandas/__init__.py. I think it would currently not be possible for a user to do

def mean(grouper: pd.core.groupby.DataFrameGroupBy) -> pd.DataFrame:
    return group.mean()

without using "private" APIs (pd.core.groupby.DataFrameGroupBy).

@jreback
Copy link
Contributor

jreback commented Sep 23, 2021

none of those are documented as public

we have an issue about deprecating and making these private but hasn't gotten done

don't add them explicitly to any exports

@jreback
Copy link
Contributor

jreback commented Sep 23, 2021

some of these apis should be exported in pd.api as well (eg DataframeGroupby for example)

@twoertwein
Copy link
Member Author

So the only public "top-level sub-packages" are: "api", "arrays", "options", "test", "testing"? They are all imported in pandas/__init__.py.

@twoertwein
Copy link
Member Author

Reading through the documentation, the following should be listed as public in pandas/__init__.py:
pandas.errors: just defines warnings/errors
pandas.plotting: has itself __all__ to define what is public
pandas/testing.py: has itself __all__ to define what is public
pandas/io: contains some public classes but is missing an __all__
pandas/tseries: contains some public classes but is missing an __all__

@twoertwein
Copy link
Member Author

I was trying to expose only meant-to-be-public sub-modules (public = listed in the documentation) in pandas/io/__init__.py and pandas/tseries/__init__.py by importing them and adding them to __all__ (and then repeating this for their sub-modules that do not yet have an __all__). Unfortunately, this leads to circular imports :(

Here is an example how I would pandas/io/__init__.py to look like

# import modules that have public classes/functions
from pandas.io import (
    formats,
    json,
    stata,
)

# and mark only those modules as public
__all__ = ["formats", "json", "stata"]

I think many people assume that everything that doesn't start with an underscore is public. __all__ could be a gentle nudge to remind people what is public. For example, after renaming get_filepath_buffer to _get_filepath_buffer in #37639 it is clear that at least two projects (PEtab-dev/PEtab#493 and fphammerle/freesurfer-stats#15) assumed it to be public even though it is not listed in the documentation.

@jreback
Copy link
Contributor

jreback commented Sep 28, 2021

can you rebase

@jreback
Copy link
Contributor

jreback commented Sep 28, 2021

https://github.com/pandas-dev/pandas/pull/43695/checks?check_run_id=3734385197

hmm maybe need to be more explict in the docs

@twoertwein
Copy link
Member Author

https://github.com/pandas-dev/pandas/pull/43695/checks?check_run_id=3734385197

from pandas import * causes the deprecated models to be imported as they are listed in __all__.

I think there are two options:

  1. don't use from pandas import *
  2. do not add the deprecated modules to __all__

Which one is preferred?

@jreback
Copy link
Contributor

jreback commented Sep 28, 2021

i don't think we should add the deprecated modules to all

@jreback jreback merged commit b892fc6 into pandas-dev:master Sep 28, 2021
@jreback
Copy link
Contributor

jreback commented Sep 28, 2021

thanks @twoertwein very nice!

gasparitiago pushed a commit to gasparitiago/pandas that referenced this pull request Oct 9, 2021
@twoertwein twoertwein deleted the import branch May 26, 2022 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TYP: Redundant imports to define the public API
3 participants